AITopics | Hradec Králové Region

Collaborating Authors

Hradec Králové Region

BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction

Kohút, Jan, Dočekal, Martin, Hradiš, Michal, Vaško, Marek

arXiv.org Artificial Intelligence, Mar-25-2025

Manual digitization of bibliographic metadata is time-consuming and labor-intensive, especially for historical and real-world archives with highly variable formatting across documents. Despite advances in machine learning, the absence of dedicated datasets for metadata extraction hinders automation. To address this gap, we introduce BiblioPage, a dataset of scanned title pages annotated with structured bibliographic metadata. The dataset consists of approximately 2,000 monograph title pages collected from 14 Czech libraries, spanning a wide range of publication periods, typographic styles, and layout structures. Each title page is annotated with 16 bibliographic attributes, including title, contributors, and publication metadata, along with precise positional information in the form of bounding boxes. To extract structured information from this dataset, we evaluated object detection models such as YOLO and DETR combined with transformer-based OCR, achieving a maximum mAP of 52 and an F1 score of 59. Additionally, we assess the performance of various visual large language models, including Llama 3.2-Vision and GPT-4o, with the best model reaching an F1 score of 67. BiblioPage serves as a real-world benchmark for bibliographic metadata extraction, contributing to document understanding, document question answering, and document information extraction.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.19658

Country:

Europe > Czechia > South Moravian Region > Brno (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs

Pan, Haowen, Wang, Xiaozhi, Cao, Yixin, Shi, Zenglin, Yang, Xun, Li, Juanzi, Wang, Meng

arXiv.org Artificial Intelligence, Mar-17-2025

Knowledge editing aims to update outdated information in Large Language Models (LLMs). A representative line of study is locate-then-edit methods, which typically employ causal tracing to identify the modules responsible for recalling factual knowledge about entities. However, we find these methods are often sensitive only to changes in the subject entity, leaving them less effective at adapting to changes in relations. This limitation results in poor editing locality, which can lead to the persistence of irrelevant or inaccurate facts, ultimately compromising the reliability of LLMs. We believe this issue arises from the insufficient precision of knowledge localization. To address this, we propose a Fine-grained Neuron-level Knowledge Editing (FiNE) method that enhances editing locality without affecting overall success rates. By precisely identifying and modifying specific neurons within feed-forward networks, FiNE significantly improves knowledge localization and editing. Quantitative experiments demonstrate that FiNE efficiently achieves better overall performance compared to existing techniques, providing new insights into the localization and modification of knowledge within LLMs.

Recently, various methods for the precise editing of outdated or wrong knowledge within Large Language Models (LLMs) (Touvron et al., 2023a;b; Jiang et al., 2024; Dubey et al., 2024) have been proposed (Mazzia et al., 2023; Yao et al., 2023; Wang et al., 2023). This paper primarily focuses on locate-then-edit methods, which have emerged as a promising and mainstream approach for knowledge editing in LLMs. A key representative of these approaches is ROME (Meng et al., 2022), which employs causal tracing to identify specific modules responsible for recalling facts about subject entities.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.0109

Country:

Asia > Russia (0.15)
Europe > United Kingdom (0.15)
Europe > Italy (0.05)
(29 more...)

Genre: Research Report (1.00)

Industry: Government (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

On the Empirical Complexity of Reasoning and Planning in LLMs

Kang, Liwei, Zhao, Zirui, Hsu, David, Lee, Wee Sun

arXiv.org Artificial Intelligence, Jun-17-2024

Chain-of-thought (CoT), tree-of-thought (ToT), and related techniques work surprisingly well in practice for some complex reasoning tasks with Large Language Models (LLMs), but why? This work seeks the underlying reasons by conducting experimental case studies and linking the performance benefits to well-established sample and computational complexity principles in machine learning. We experimented with 6 reasoning tasks, ranging from grade school math, air travel planning, ..., to Blocksworld. The results suggest that (i) both CoT and ToT benefit significantly from task decomposition, which breaks a complex reasoning task into a sequence of steps with low sample complexity and explicitly outlines the reasoning structure, and (ii) for computationally hard reasoning tasks, the more sophisticated tree structure of ToT outperforms the linear structure of CoT. These findings provide useful guidelines for the use of LLMs in solving reasoning tasks in practice.

blue block, item, orange block, (16 more...)

arXiv.org Artificial Intelligence

2404.11041

Country:

North America > United States > New York (0.05)
North America > Guatemala > Guatemala > Guatemala City (0.05)
North America > Honduras > Francisco Morazán > Tegucigalpa (0.05)
(35 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Consumer Products & Services > Travel (0.34)
Transportation > Air (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Critical Review for One-class Classification: recent advances and the reality behind them

Hayashi, Toshitaka, Cimr, Dalibor, Fujita, Hamido, Cimler, Richard

arXiv.org Artificial Intelligence, Apr-27-2024

This paper offers a comprehensive review of one-class classification (OCC), examining the technologies and methodologies employed in its implementation. It delves into various approaches utilized for OCC across diverse data types, such as feature data, image, video, time series, and others. Through a systematic review, this paper synthesizes prominent strategies used in OCC from its inception to its current advancements, with a particular emphasis on the promising application. Moreover, the article criticizes the state-of-the-art (SOTA) image anomaly detection (AD) algorithms dominating one-class experiments. These algorithms include outlier exposure (binary classification) and pretrained model (multi-class classification), conflicting with the fundamental concept of learning from one class. Our investigation reveals that the top nine algorithms for one-class CIFAR10 benchmark are not OCC. We argue that binary/multi-class classification algorithms should not be compared with OCC.

algorithm, classification, detection, (12 more...)

arXiv.org Artificial Intelligence

2404.17931

Country:

Europe > Czechia > Hradec Králové Region > Hradec Králové (0.04)
North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > Ohio > Greene County > Fairborn (0.04)
(8 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.78)
(2 more...)

Decentralized Coordination of Distributed Energy Resources through Local Energy Markets and Deep Reinforcement Learning

May, Daniel, Taylor, Matthew, Musilek, Petr

arXiv.org Artificial Intelligence, Apr-19-2024

As the energy landscape evolves toward sustainability, the accelerating integration of distributed energy resources poses challenges to the operability and reliability of the electricity grid. One significant aspect of this issue is the notable increase in net load variability at the grid edge. Transactive energy, implemented through local energy markets, has recently garnered attention as a promising solution to address the grid challenges in the form of decentralized, indirect demand response on a community level. Given the nature of these challenges, model-free control approaches, such as deep reinforcement learning, show promise for the decentralized automation of participation within this context. Existing studies at the intersection of transactive energy and model-free control primarily focus on socioeconomic and self-consumption metrics, overlooking the crucial goal of reducing community-level net load variability. This study addresses this gap by training a set of deep reinforcement learning agents to automate end-user participation in ALEX, an economy-driven local energy market. In this setting, agents do not share information and only prioritize individual bill optimization. The study unveils a clear correlation between bill reduction and reduced net load variability in this setup. The impact on net load variability is assessed over various time horizons using metrics such as ramping rate, daily and monthly load factor, as well as daily average and total peak export and import on an open-source dataset. Agents are then benchmarked against several baselines, with their performance levels showing promising results, approaching those of a near-optimal dynamic programming benchmark.
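
The net load variability metrics named in the abstract above can be illustrated with a minimal sketch. The definitions used here are common textbook ones and are assumptions, not taken from the paper: load factor as average load divided by peak load over a window, and ramping rate as the mean absolute change between consecutive samples. The function names and the sample series are hypothetical.

```python
from statistics import mean


def load_factor(net_load):
    """Average load divided by peak load over the window (assumed definition).

    A value near 1.0 means a flat profile; lower values mean a peakier one.
    """
    peak = max(net_load)
    if peak <= 0:
        # This simple form only makes sense for import-dominated windows.
        raise ValueError("peak load must be positive")
    return mean(net_load) / peak


def mean_ramping_rate(net_load):
    """Mean absolute change between consecutive samples (assumed definition)."""
    if len(net_load) < 2:
        raise ValueError("need at least two samples")
    return mean(abs(b - a) for a, b in zip(net_load, net_load[1:]))


# Hypothetical community net load samples in kW, not data from the paper.
flat = [2.0, 2.0, 2.0, 2.0]
peaky = [0.0, 4.0, 0.0, 4.0]

print(load_factor(flat), mean_ramping_rate(flat))
print(load_factor(peaky), mean_ramping_rate(peaky))
```

Both series have the same average load, but the second has a lower load factor and a higher ramping rate, which is the kind of variability the study aims to reduce.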

agent, drl agent, variability, (15 more...)

arXiv.org Artificial Intelligence

2404.13142

Country:

North America > United States (0.28)
North America > Canada > Alberta (0.14)
Europe > Czechia > Hradec Králové Region > Hradec Králové (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Energy > Power Industry (1.00)
Government > Regional Government > North America Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Transactive Local Energy Markets Enable Community-Level Resource Coordination Using Individual Rewards

May, Daniel C., Musilek, Petr

arXiv.org Artificial Intelligence, Mar-22-2024

ALEX (Autonomous Local Energy eXchange) is an economy-driven, transactive local energy market where each participating building is represented by a rational agent. Relying solely on building-level information, this agent minimizes its electricity bill by automating distributed energy resource utilization and trading. This study examines ALEX's capabilities to align participant and grid-stakeholder interests and assesses ALEX's impact on short- and long-term intermittence using a set of community net-load metrics, such as ramping rate, load factor, and peak load. The policies for ALEX's rational agents are generated using dynamic programming through value iteration in conjunction with iterative best response. This facilitates comparing ALEX and a benchmark energy management system, which optimizes building-level self-consumption, ramping rate, and peak net load. Simulations are performed using the open-source CityLearn2022 dataset to provide a pathway for benchmarking by future studies. The experiments demonstrate that ALEX enables the coordination of distributed energy resources across the community. Remarkably, this community-level coordination occurs even though the system is populated by agents who only access building-level information and selfishly maximize their own relative profit. Compared to the benchmark energy management system, ALEX improves across all metrics.

agent, hypothesis, scenario, (17 more...)

arXiv.org Artificial Intelligence

2403.15617

Country:

North America > Canada > Alberta (0.14)
Europe > Czechia > Hradec Králové Region > Hradec Králové (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.48)
Research Report > New Finding (0.46)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

CsFEVER and CTKFacts: Acquiring Czech data for fact verification

Ullrich, Herbert, Drchal, Jan, Rýpar, Martin, Vincourová, Hana, Moravec, Václav

arXiv.org Artificial Intelligence, Oct-17-2022

In this paper, we examine several methods of acquiring Czech data for automated fact-checking, which is a task commonly modeled as a classification of textual claim veracity w.r.t. a corpus of trusted ground truths. We attempt to collect sets of data in the form of a factual claim, evidence within the ground truth corpus, and its veracity label (supported, refuted or not enough info). As a first attempt, we generate a Czech version of the large-scale FEVER dataset built on top of the Wikipedia corpus. We take a hybrid approach of machine translation and document alignment; the approach and the tools we provide can be easily applied to other languages. We discuss its weaknesses and inaccuracies, propose a future approach for their cleaning and publish the 127k resulting translations, as well as a version of this dataset reliably applicable for the Natural Language Inference task - the CsFEVER-NLI. Furthermore, we collect a novel dataset of 3,097 claims, which is annotated using the corpus of 2.2M articles of Czech News Agency. We present its extended annotation methodology based on the FEVER approach, and, as the underlying corpus is kept a trade secret, we also publish a standalone version of the dataset for the task of Natural Language Inference we call CTKFactsNLI. We analyze both acquired datasets for spurious cues - annotation patterns leading to model overfitting. CTKFacts is further examined for inter-annotator agreement, thoroughly cleaned, and a typology of common annotator errors is extracted. Finally, we provide baseline models for all stages of the fact-checking pipeline and publish the NLI datasets, as well as our annotation platform and other experimental data.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s10579-023-09654-3

2201.11115

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Czechia > Prague (0.05)
Europe > Germany (0.04)
(18 more...)

Genre: Research Report > New Finding (0.92)

Industry: Media > News (0.88)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Recognition of Patient Groups with Sleep Related Disorders using Bio-signal Processing and Deep Learning

Jarchi, Delaram, Andreu-Perez, Javier, Kiani, Mehrin, Vysata, Oldrich, Kuchynka, Jiri, Prochazka, Ales, Sanei, Saeid

arXiv.org Artificial Intelligence, Nov-10-2021

Accurately diagnosing sleep disorders is essential for clinical assessments and treatments. Polysomnography (PSG) has long been used for detection of various sleep disorders. In this research, electrocardiography (ECG) and electromyography (EMG) have been used for recognition of breathing and movement-related sleep disorders. Bio-signal processing has been performed by extracting EMG features exploiting entropy and statistical moments, in addition to developing an iterative pulse peak detection algorithm using synchrosqueezed wavelet transform (SSWT) for reliable extraction of heart rate and breathing-related features from ECG. A deep learning framework has been designed to incorporate EMG and ECG features. The framework has been used to classify four groups: healthy subjects, patients with obstructive sleep apnea (OSA), patients with restless leg syndrome (RLS) and patients with both OSA and RLS. The proposed deep learning framework produced a mean accuracy of 72% and weighted F1 score of 0.57 across subjects for our formulated four-class problem.
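
The gap between the reported accuracy and weighted F1 score is typical of imbalanced multi-class problems. The following minimal sketch shows how the two metrics can diverge, assuming the usual support-weighted average of per-class F1 scores. The labels and predictions are hypothetical and are not data from the paper, and the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Hypothetical imbalanced sample: a classifier that always predicts "healthy".
y_true = ["healthy"] * 6 + ["OSA"] * 2
y_pred = ["healthy"] * 8

print(accuracy(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
```

In this toy case accuracy is 0.75 while weighted F1 is lower, because the minority class contributes an F1 of zero.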

algorithm, frequency, sensor 2020, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/s20092594

2111.05917

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Czechia > Prague (0.05)
Europe > Czechia > Hradec Králové Region > Hradec Králové (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Public sector procurement of AI across Europe

#artificialintelligence, Oct-7-2019, 07:29:52 GMT

What is happening in Europe regarding Artificial Intelligence (AI) and Machine Learning (ML)? "A lot" is the short answer. There is a Cambrian explosion of projects, studies and initiatives in this field. In this post we provide an overview of these developments. We look at some of the latest examples from across Europe where public sector buyers are looking for suppliers of Artificial Intelligence, Machine Learning and Big Data systems, products and services.

artificial intelligence, contract, europe, (9 more...)

#artificialintelligence

Country:

Europe > United Kingdom > Scotland (0.06)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)
Europe > Germany (0.05)
Europe > Czechia > Hradec Králové Region > Hradec Králové (0.05)

Industry:

Information Technology > Security & Privacy (0.98)
Government > Regional Government > Europe Government (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
